New evidence for chunk-based models in word segmentation.

Authors

  • Pierre Perruchet
  • Bénédicte Poulin-Charronnat
  • Barbara Tillmann
  • Ronald Peereman
Abstract

There is ample evidence that infants are able to exploit statistical cues to discover the words of their language. However, how they do so remains the object of enduring debate. The prevalent position is that words are extracted on the basis of prior statistical computations, in particular of the transitional probabilities between syllables. As an alternative, chunk-based models posit that sensitivity to statistics results from other processes, whereby many potential chunks are considered as candidate words and then selected as a function of their relevance. These two classes of models have proven difficult to dissociate. We propose here a procedure that leads to contrasting predictions regarding the influence of a first language, L1, on the segmentation of a second language, L2. Simulations run with PARSER (Perruchet & Vinter, 1998), a chunk-based model, predict that when the words of L1 become word-external transitions of L2, learning of L2 should be impaired, to the point of reaching below-chance performance, at least until extensive exposure to L2 reverses the effect. In the same condition, a transitional-probability-based model predicts above-chance performance whatever the duration of exposure to L2. PARSER's predictions were confirmed by experimental data: performance on a two-alternative forced-choice test between words and part-words from L2 was significantly below chance, even though the part-words were less cohesive in terms of transitional probabilities than the words.
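
To make the transitional-probability account concrete, the sketch below (ours, not the authors' code and not PARSER) segments a toy syllable stream by computing TP(x → y) = count(xy) / count(x) and positing a word boundary at each local TP dip. The function names and the three artificial "words" in the stream are illustrative assumptions, in the spirit of Saffran-style stimuli.

```python
from collections import Counter

def transitional_probabilities(syllables):
    """TP(x -> y) = count of 'x followed by y' / count of x as a first element."""
    pair_counts = Counter(zip(syllables, syllables[1:]))
    first_counts = Counter(syllables[:-1])
    return {(x, y): c / first_counts[x] for (x, y), c in pair_counts.items()}

def segment_at_tp_dips(syllables):
    """Posit a word boundary wherever the TP of a transition is lower than
    the TPs of both neighbouring transitions (a local dip)."""
    tps = transitional_probabilities(syllables)
    vals = [tps[pair] for pair in zip(syllables, syllables[1:])]
    words, current = [], [syllables[0]]
    for i, syl in enumerate(syllables[1:], start=1):
        incoming = vals[i - 1]                       # TP into the current syllable
        left = vals[i - 2] if i >= 2 else float("inf")
        right = vals[i] if i < len(vals) else float("inf")
        if incoming < left and incoming < right:     # local dip -> word boundary
            words.append("".join(current))
            current = []
        current.append(syl)
    words.append("".join(current))
    return words

# Toy stream concatenating three artificial words (tupiro, golabu, bidaku).
stream = ("tu pi ro go la bu bi da ku tu pi ro bi da ku "
          "go la bu tu pi ro go la bu").split()
print(segment_at_tp_dips(stream))   # -> ['tupiro', 'golabu', 'bidaku', ...]
```

A chunk-based model such as PARSER would instead build and weight candidate chunks during exposure rather than computing these probabilities explicitly; the experiment reported here is designed to tell the two accounts apart.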

Similar articles

Word segmentation in Persian continuous speech using F0 contour

Word segmentation in continuous speech is a complex cognitive process. Previous research on spoken word segmentation has revealed that in fixed-stress languages, listeners use acoustic cues to stress to segment speech into words. It has been further assumed that stress in non-final or non-initial position hinders the demarcative function of this prosodic factor. In Persian, stress is retract...

Using a Chunk-based Dependency Parser to Mine Compound Words from Tweets

New words are appearing every day in online communication applications such as Twitter. Twitter is the world's most famous online social networking and microblogging service, enabling its users to send and read text-based messages of up to 140 characters, known as "tweets". Because tweets are typed online (as fast as possible) within a limited number of characters, tweets are full...

Phonological units in speech segmentation and phonological awareness

This paper explores the relationship between phonological units in speech segmentation and phonological awareness by investigating Japanese Brazilians living in Japan. The first experiment investigated the size of the phonological unit in speech segmentation using the Japanese materials and methodology of Otake et al. (1993). As with the French subjects in the earlier study, the miss rates showed an...

Do Chinese Readers Follow the National Standard Rules for Word Segmentation during Reading?

We conducted a preliminary study to examine whether Chinese readers' spontaneous word segmentation processing is consistent with the national standard rules of word segmentation based on the Contemporary Chinese language word segmentation specification for information processing (CCLWSSIP). Participants were asked to segment Chinese sentences into individual words according to their prior knowl...

Building A Chinese Text Summarizer with Phrasal Chunks and Domain Knowledge

This paper introduces a Chinese summarizer called ThemePicker. Though the system incorporates both statistical and text-analysis models, the statistical model plays the major role in the automated process. In addition to word segmentation and proper-name identification, phrasal chunk extraction and content-density calculation are based on a semantic network pre-constructed for a chosen doma...

Journal:
  • Acta psychologica

Volume 149, Issue -

Pages -

Publication date: 2014